In the current era, the use of corpora in language teaching is mainly 
explored in English classes as it has become a trend in education. 
Hence, this research aimed to identify the corpus metadata, 
frequently used words, and unique words related to the Islamic 
boarding school context to be used in the English instructional 
process. This research employed a mixed method combining 
quantitative and qualitative data analysis methods. Two English 
Islamic boarding school books, several articles covering the scope of 
Islamic boarding school, and students’ speech texts were selected as 
the data. Then, they were analyzed using the Voyant tool. The 
findings showed a total of 49,970 words, 5,417 unique words, a 
vocabulary density of 0.108, and a readability index of 12,980. The findings will 
be incorporated into instructional resources for developing Islamic 
boarding school students' general and/or specialized vocabulary. The 
words, in particular, will provide a foundation for students in 
constructing Islamic speech texts, delivering speeches, and using 
English in an Islamic boarding school environment. 


This is an open access article under the CC BY-SA license. 


Corresponding Author: 
Yulia Agustina 


Department of Language Education Science, Faculty of Languages, Arts, and Cultures 


Universitas Negeri Yogyakarta 
Yogyakarta, 55281, Indonesia 


Email: yulia0012pasca.2019@student.uny.ac.id 


1. INTRODUCTION 


Using corpora in language teaching has become a trend in education and is currently discussed primarily in 
English instruction. A corpus (singular) is a collection of texts, spoken or written, stored 
electronically [1]. It is also described as a set of written or spoken texts stored on a computer and 
compiled for particular purposes [2], [3]. McEnery and Hardie [1] further 
state that a corpus is a sizable, ethical collection of naturally occurring writings that is meant to be 
representative of a particular language or language variety. Hence, a corpus is a systematic collection of texts 
representing a language or language variety. As a result of its careful design, it is a helpful tool for linguistic 
analysis, language sampling, language instruction, and a variety of other language-related studies and uses. 

As stated earlier, corpora (the plural of corpus) can be formed from spoken or written texts. A spoken 
corpus takes considerably longer to build because speech has to be transcribed and possibly coded for some 
of its non-verbal features. On the other hand, written corpora can be built very quickly using the internet as a 
source [2]. Building a corpus from written language requires three steps: creating a design rationale, inputting 
the text, and recording the text in a database [2]. The design rationale covers deciding what corpus is to be made and how 
many texts and sources are needed. Meanwhile, text input refers to re-typing and scanning the texts before 
uploading them to the tool. Then, the text database refers to tracing the texts used in making the corpus. Based on these 
considerations, this present study focused on the written language, particularly the ones made by several 
researchers who had discussed Islamic boarding school and Islamic values. 

Furthermore, operating corpora requires corpus tools and an internet connection to download the 
software. Some corpora and tools can be used freely while others are commercial, such as the Corpus of 
Contemporary American English (COCA), AntConc, the British National Corpus (BNC), WordSmith Tools, the TIME 
Magazine Corpus of American English, and MonoConc. According to O’Keeffe et al. [2], the 
basic analytical activities in corpora include word frequency counts, concordance, and collocation. A 
concordance is a list of the terms found in a collection of texts that includes information on where and how 
frequently each word appears, while collocation refers to a set of two or more words that typically 
go together and constitute a natural combination of closely related words. Moreover, 
the basic corpus linguistics techniques comprise concordance, wordlists or word frequency, keyword analysis, 
and cluster analysis: i) key word in context (KWIC) concordance refers to the method of setting up and 
presenting textual information to highlight the context in which specific keywords occur; ii) a wordlist is a 
compilation or grouping of words that is typically arranged alphabetically; iii) keyword analysis 
identifies the keywords in a text; and iv) cluster analysis is the process of grouping data into coherent groups or 
clusters to find significant patterns or structures within a dataset. These corpus linguistics techniques can be 
used according to the researchers’ needs in building the corpus. 
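
To make the KWIC and collocation techniques described above more concrete, the following minimal Python sketch shows one way they could be computed. It is an illustration only and not the procedure used by Voyant or any other tool named here: the function names `tokenize`, `kwic`, and `collocates`, the context widths, and the simple regular-expression tokenizer are all assumptions, and the sample text is pieced together from two concordance fragments quoted elsewhere in this article.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, keyword, width=4):
    """Return (left context, keyword, right context) for every hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, keyword, window=1):
    """Count the words found within `window` tokens of the keyword."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            lo, hi = max(0, i - window), i + window + 1
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


# Hypothetical sample built from two fragments quoted in this article.
sample = ("and through education in Islamic boarding school. "
          "Pesantren is the oldest Islamic institution growing in this country")
tokens = tokenize(sample)

for left, term, right in kwic(tokens, "islamic"):
    print(f"{left:>30} | {term} | {right}")
print(collocates(tokens, "islamic"))
```

A real analysis would need a tokenizer that handles transliterated Arabic terms, punctuation, and multi-word units such as "Allah SWT" more carefully than this sketch does.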

Additionally, teaching English using corpora can be used to create instructional materials, language 
tests, grammar exercises, classroom activities, and syllabus designs [4]. Teachers usually integrate corpus 
into their teaching in three ways: first, they get information through corpus searches; second, they produce 
materials on a level basis; and third, they have students work with these materials. Moreover, teachers can 
generate specialized corpora from authentic texts or students’ papers and then assign them to analyze the data 
[4] or use the online corpora that are already accessible to teach a particular language pattern. Leech [5] also 
elaborates on the direct use of corpora by distinguishing teaching about, teaching to exploit, and exploiting to teach. 
According to McEnery and Xiao [6], teaching about refers to the academic study of corpus linguistics and 
linguistic concepts like syntax and pragmatics. Meanwhile, teaching to exploit relates to the students' 
practical experiences and knowledge that enable them to utilize corpora for their own needs, which means that 
the educational activity focuses on the students. Lastly, using corpus-based techniques to teach courses in 
sociolinguistics and discourse analysis belongs to exploiting to teach [6]. 

According to Hunston [7], a corpus benefits researchers and language learners since it effectively 
informs people what language is like. In the context of English teacher training, a corpus has great potential 
to help teachers design effective teaching activities as it provides authentic language data/examples and 
collocation learning [8]. Further, Fauzi [9] states that corpus-based teaching and learning 
should be utilized in vocabulary instruction 
because it focuses on learning words, keeping them in long-term memory, and then using them in speaking. 
This aligns with the needs of Islamic boarding school students, who require vocabulary mastery to communicate 
and interact in English daily. Nonetheless, existing literature suggests that many students have difficulties in 
learning English vocabulary and studying the language in general [10]. As languages are built on words, 
teaching vocabulary is essential to language learning [10]. Therefore, corpus linguistics gives new insight 
into language learning by serving, like a dictionary, as an aid for finding familiar words, unfamiliar words, and word 
arrangements. The primary argument favoring the employment of a corpus is that it is a more trustworthy 
guide to language use than native-speaker intuition [9]. 

Thus far, several researchers have created a remarkable corpus to produce specific vocabulary lists 
for particular purposes: Islamic religious studies textbooks vocabulary (IRSTV) corpus [3], nursing [11], 
English for young learners [12], medical English [13], and tourism [14], [15]. It can be concluded that the 
existing literature focused on the manufacture of the corpus. Although this present study has similar purposes 
and interests in creating a corpus, it differs from the previous ones as it concentrates on applying a unique 
word list in English language instruction in Islamic boarding school, an Islamic learning center providing 
faithful formal education and religious teaching [16]. Islamic boarding school is also defined as educational 
establishments that respect and preserve both the scientific tradition and morality of Muslims where the 
students spend most of their time; hence, they can live and learn about Islam from the cleric [16], [17]. So 
far, English instruction in Islamic boarding schools has used general English similar to English instruction in 
other public schools, which does not fully consider Islamic boarding school students’ needs. It is necessary 
that Islamic boarding school students learn vocabulary related to the Islamic boarding school context. Relevant 
vocabulary items will be used for more specific purposes such as constructing speech texts, delivering 
speeches on designated days, and using English in the Islamic boarding school environment. Some of the 
context-relevant words include makmum (congregation), wudlu (ablution), riya (showing off), fasting, prayer, 
and recite. Therefore, an Islamic boarding school English corpus needs to be built to ascertain the 
uniqueness of vocabulary related to the Islamic boarding school context. The research questions are 
as follows: i) what is the metadata for building the Islamic boarding school English corpus?; ii) what are the most 
frequently used and unique words in the Islamic boarding school English corpus?; and iii) how are lexical 
terms and their collocations used in the Islamic boarding school context compared to the Leipzig corpora? 


2. METHOD 

The method used in this study was a mixed-method design that combined both qualitative and 
quantitative approaches [18]. Creswell and Creswell [19] state that mixed-method research is a method used for 
conducting research involving collecting, analyzing, and integrating quantitative and qualitative data. The 
qualitative method refers to information in textual form, which was analyzed using qualitative data 
analysis techniques. In contrast, the quantitative method refers to numerical data analyzed using 
quantitative data analysis techniques. The two methods were applied to complement one another and to present a clear and 
comprehensive description of the findings. 


2.1. Data collection 

The researchers collected several English books and articles explaining Islamic boarding school’s 
scope. The books were written by Solahudin [20], which described the Islamic creativity of Daarut Tauhid 
Islamic boarding school in Bandung, West Java, and Srimulyani [21], which described the women from 
traditional Islamic educational institutions in Indonesia. Then, some relevant articles were also used [22]-[24]. 
These sources were chosen because they explained the Islamic values in Islamic boarding school and 
were considered to already represent the data of the Islamic boarding school being sought. This study aimed 
to discover the frequent and unique words in the Islamic boarding school English corpus. The researchers 
used these data to establish a corpus-based English instructional model for Islamic boarding school students. 


2.2. Procedures of the study 

Corpus approach comprises three primary characteristics: i) practicality, significance, and morality, 
ii) extensive use of computer analysis, and iii) reliance on qualitative and quantitative analytical techniques 
[3], [14]. In addition, three factors must be considered while constructing a corpus: the collection must be 
ethical, contain authentic texts, and be electronically maintained [25]. To compose this corpus, the 
researchers took the steps of collecting two books and several articles related to the Islamic boarding school 
context, choosing representative texts that suit the needs, and then uploading them to the corpus tool, Voyant. 
The activities required extensive use of computers and were analyzed qualitatively and quantitatively. 


2.3. Corpus tool 

The use of a corpus is always related to computer-mediated software analysis. Many 
software tools are available to examine language data, such as Voyant, AntConc, Sketch Engine, MonoConc Pro, and 
WordSmith Tools. In this study, the Voyant application was chosen as the primary tool. This straightforward 
tool covered all the researchers' needs with one click. To view word frequency lists, frequency 
distribution plots, KWIC displays, and unique words, the researchers only needed to upload the texts, which included all language data 
from the books and articles, to the tool and then press enter. The initial results indicate that there are 
49,970 total words in the Islamic boarding school corpus. Of these, 5,417 are unique 
words that can be used in the analysis of teaching materials. Meanwhile, the vocabulary density is 
0.108, which serves as a measurement of vocabulary usage in comparison to the length of the texts. 
Finally, the readability index, with a value of 12,980, gives an estimate of how difficult the text is to 
read. The data of the Islamic boarding school English corpus are presented in Table 1. 


Table 1. Islamic boarding school English corpus 
Corpus name | Total words | Unique words | Vocabulary density | Readability index 
Islamic boarding school English corpus | 49,970 | 5,417 | 0.108 | 12,980 
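
As a quick check on how the figures in Table 1 relate to one another, the sketch below computes vocabulary density as the ratio of unique words to total words. This definition is an assumption made for illustration (it is the usual type-token ratio), and the function name `vocabulary_density` is hypothetical; the two counts are the ones reported in Table 1.

```python
def vocabulary_density(unique_words, total_words):
    """Type-token ratio: distinct word forms divided by running words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return unique_words / total_words


# Counts reported for the Islamic boarding school English corpus.
density = vocabulary_density(5_417, 49_970)
print(f"{density:.3f}")  # prints 0.108
```

Under this assumed definition, the reported counts reproduce the density of 0.108 given in Table 1. A lower value indicates that the same word forms are repeated more often across the texts.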


3. RESULTS AND DISCUSSION 

The research results are presented in sub-sections along with the discussion. The results are 
presented in such a way as to answer the research questions. The data presented is the authentic data taken 
from the written sources described above. 
3.1. Corpus metadata 

The Islamic boarding school English corpus was constructed from two English Islamic boarding school 
books and several articles covering the scope of Islamic boarding school. The books were chosen from 
international publishers, Australian National University (ANU) Press and Amsterdam University Press, while the papers 
were selected from credible Indonesian journals. However, not all the texts were included in the analysis; 
only those that suited the purpose of this study were selected. The first thing to do in 
building this corpus was to decide on a design rationale and how many texts and sources were needed for 
creating the corpus [2]. After collecting the required texts, the next steps were re-typing and scanning the 
written text from the sources. To convert the scanned data, the researchers uploaded the 
documents to the tool, and the results could be seen easily. Table 2 presents the corpus metadata used in designing 
the Islamic boarding school English corpus. 


Table 2. Corpus metadata 

No | Metadata | Descriptions 
1 | Corpus name | Islamic boarding school English corpus 
2 | Corpus language | English 
3 | Corpus size | 49,970 
4 | Corpus type | Specialized corpus 
5 | Authors | Yulia Agustina, Pratomo Widodo, Margana Margana 
6 | Institution | Yogyakarta State University (UNY, Universitas Negeri Yogyakarta) 
7 | Aim | This corpus is a representation of the language used in the Islamic boarding school context. It consists of texts by several writers and researchers in the field of Islamic boarding school, on topics such as pesantren history, the existence of Pondok Pesantren (Islamic boarding school), Islamic values in pesantren, pesantren role, changes and future of pesantren, I'tikaf and Lailatul Qodar, the Al-hajj and the Umrah, silaturrahmi, tausyiah, and moral decadence. This corpus is part of the first author's dissertation. It is intended to obtain unique words related to the Islamic boarding school context and then use them in teaching materials as part of the model developed. 
8 | Authors and materials’ titles | Solahudin [20], the workshop for morality: the Islamic creativity of Pesantren Daarut Tauhid in Bandung, Java, 2008; Srimulyani [21], women from traditional Islamic educational institutions in Indonesia negotiating public spaces, 2012; Suhartini [22], the internalization of Islamic values in Pesantren, (December 2016); Thahir [23], the role and function of Islamic boarding school: an Indonesian context, (April 2014); Zakaria [24], Pondok Pesantren: changes and its future, (April 2010). 
9 | Publisher/website | Books: ANU E Press and Amsterdam University Press. Articles: Journal of Islamic and Arabic Education, TAWARIKH: International Journal for Historical Studies, Jurnal Pendidikan Islam (Islamic educational institutions concerning Islamic education). 
10 | Types | Books, articles, and English speech texts 

3.2. Word list and unique words in Islamic boarding school English corpus 
3.2.1. Most frequently used words 

The most frequently used English words found in the Islamic boarding school English 
corpus are summarized in Table 3. The corpus statistics comprise 5,417 unique words, a 
vocabulary density of 0.108, and a readability index of 12,980. In total, the 50 most frequently used words in the 
Islamic boarding school English corpus are listed. They include the words Islamic (616), pesantren (Islamic boarding 
school) (484), school (334), boarding (278), education (275), pondok (228), and Allah SWT (191). 
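
A frequency list of this kind can be approximated with a few lines of Python. The sketch below is illustrative only and does not reproduce Voyant's behaviour: the function name `top_words`, the tiny stop-word list, and the tokenizer are assumptions, and the sample text is pieced together from concordance fragments quoted in this article rather than taken from the full corpus.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption, far shorter than a real one).
STOPWORDS = {"the", "is", "in", "of", "and", "a", "to", "by", "as", "that"}


def top_words(text, n=5, stopwords=STOPWORDS):
    """Return the n most frequent word forms, ignoring stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tok for tok in tokens if tok not in stopwords)
    return counts.most_common(n)


# Hypothetical sample made of fragments quoted elsewhere in this article.
sample = ("Islamic boarding school is faced by the "
          "Islamic boarding school in the future is "
          "Pesantren is the oldest Islamic institution growing in this country")

for word, freq in top_words(sample, n=3):
    print(word, freq)
```

Like the list reported here, which counts Muslim and Muslims or school and schools separately, this sketch counts surface word forms and does not group inflected forms under one lemma.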


3.2.2. Unique words 

This subsection presents the findings regarding the unique words. Unique words are defined as 
words that are present in the target texts but uncommon or absent from the other texts in a corpus. Yet, they 
can represent a rather general phenomenon [26]. The 50 unique words found in the Islamic 
boarding school English corpus are summarized in Table 4 (in Appendix). 

Table 4 shows that the word ‘Islamic’ is still the first unique word that occurred in the Islamic 
boarding school English corpus; this finding is similar to the one reported in Table 3. This means that the 
word pesantren can always be associated with the word ‘Islamic’. Moreover, since this corpus 
is smaller than other corpora such as COCA, BNC, Corpus Mate, or the Leipzig corpora, it was 
compared with one of them, the Leipzig corpora, particularly because they contain a collection of Indonesian 
texts. This comparison also aimed at providing additional information to readers or students during 
their study in the classroom. 

Nevertheless, four words occur more often in the Islamic boarding school English corpus 
than in the Leipzig corpora. For three of them the difference is small: mosque (57:52), 
repentance (10:9), and preach (11:8). On the other hand, the gap for the word proselytizing (20:2) is far 
wider, which means that this word is markedly more prominent in the Islamic boarding school English corpus. However, when 
it comes to the reason for constructing a corpus, the quantity of the word is not a critical factor. The 
most frequent words in the Islamic boarding school English corpus will be incorporated later into the teaching 
materials. 
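
The size of these gaps can be expressed as simple ratios. The sketch below is a hedged illustration: it treats the first number in each pair as the count in the Islamic boarding school English corpus and the second as the count in the Leipzig corpora, and the function name `frequency_ratio` is hypothetical. Because the two corpora differ greatly in size and their token totals are not given here, these are ratios of raw counts, not normalized frequencies.

```python
# (count in Islamic boarding school English corpus, count in Leipzig corpora)
pairs = {
    "mosque": (57, 52),
    "repentance": (10, 9),
    "preach": (11, 8),
    "proselytizing": (20, 2),
}


def frequency_ratio(target_count, reference_count):
    """Ratio of raw counts; infinite if the reference count is zero."""
    if reference_count == 0:
        return float("inf")
    return target_count / reference_count


for word, (target, reference) in pairs.items():
    print(f"{word}: {frequency_ratio(target, reference):.2f}")
```

A fuller comparison would divide each count by the total number of tokens in its corpus, for example as occurrences per million words, before taking the ratio.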


Table 3. Fifty most frequently used words in Islamic boarding school English corpus 

No | Words | Frequency in corpus | No | Words | Frequency in corpus 
1 | Islamic | 616 | 26 | Shalat | 70 
2 | Pesantren (Islamic boarding school) | 484 | 27 | Muslim | 69 
3 | School | 334 | 28 | Modern | 68 
4 | Boarding | 278 | 29 | Muslims | 64 
5 | Education | 275 | 30 | Followers | 63 
6 | Pondok | 228 | 31 | Traditional | 58 
7 | Allah SWT | 191 | 32 | Schools | 57 
8 | Religious | 187 | 33 | Mosque | 57 
9 | Educational | 155 | 34 | Development | 56 
10 | Santri (students) | 144 | 35 | Society | 53 
11 | Institution | 131 | 36 | Java | 53 
12 | Kyai | 130 | 37 | Said | 52 
13 | Values | 127 | 38 | Community | 52 
14 | Islam | 119 | 39 | Technology | 49 
15 | Knowledge | 103 | 40 | Study | 49 
16 | Students | 102 | 41 | Social | 49 
17 | Indonesia | 87 | 42 | Quran | 48 
18 | Tauhid | 85 | 43 | Activities | 46 
19 | Tradition | 78 | 44 | Teaching | 45 
20 | Life | 78 | 45 | Institutions | 45 
21 | Good | 76 | 46 | Value | 44 
22 | Process | 73 | 47 | Muhammad | 44 
23 | People | 73 | 48 | Human | 44 
24 | World | 70 | 49 | Learning | 43 
25 | Time | 70 | 50 | Prophet | 42 


3.3. Collocation 
3.3.1. Collocation of some of most frequently used words 

Collocation is a fixed or semi-fixed phrase formed by two or more words regularly occurring 
together in a particular order. O’Keeffe et al. [2] state that collocation refers to three or more occurrences of 
words displayed in sentences. Several of the most frequent words are used as examples of collocation here, namely 
Islamic, pesantren, and boarding. Each of these is compared to the Leipzig corpora and elaborated below. 

a. Islamic 

The word ‘Islamic’ is the most frequently used word in the Islamic boarding school English 
corpus. Examples of sentences containing the word ‘Islamic’ can be found in 
Figure 1. It can be inferred from the figure that the word ‘Islamic’ functions as a noun (N) or an adjective (Adj), 
which can be used at the beginning, in the middle, and at the end of a sentence. For example, in line 5, “... Pesantren is 
the oldest Islamic institution growing in this country...”, the word ‘Islamic’ functions as an 
adjective. Moreover, the word ‘Islamic’ is preceded by an article or conjunction and followed 
by adjectives or nouns. To conclude, the word ‘Islamic’ functions as a noun or adjective in the Islamic 
boarding school English corpus. 

Meanwhile, the word ‘Islamic’ in the Leipzig Corpora is associated with Islamic centres, Islamic 
schools, Islamic villages, Islamic malls, Islamic preschools, and Islamic studies. The word ‘Islamic’ is 
closely related to the activities of the Islamic people. Slightly different from the English corpus Islamic 
boarding school, the word Islamic here only indicates an adjective (adj) followed by a noun (N) object. In 
detail, Figure 2 highlights how the word ‘Islamic’ is used in sentences in the Leipzig Corpora. 

b. Pesantren 

The word ‘pesantren’ can be defined as Islamic boarding schools or places of recitation activities 
[27]. In terms of part of speech, Figure 3 shows that pesantren is the name of a place and is 
preceded by a preposition of place or a conjunction. For instance, at the end of the sentence in the 10th line, 
“conducted the study at various pesantren, this paper presented result....”, it can be seen that the word 
‘pesantren’ is placed after a preposition of place, at. Moreover, the word ‘pesantren’ is usually found at the 
beginning and at the end of sentences. Figure 3 presents examples of the word ‘pesantren’ in the 
developed corpus. 
In the Leipzig corpora, the word ‘pesantren’ always appears in combinations such as pesantren Nahdlatul Ulama 
(NU), an Islamic organization in Indonesia, pesantren santri, pesantren boarding, pesantren kiai (an expert in 
Islam), pesantren school (madrasah), and Islamic pesantren. In this case, the word ‘pesantren’ can be written 
before or after another word. In grammar, this structure is called a noun phrase. In a sentence, it serves as 
the subject, the object, or the complement. A noun phrase is a group of words that identifies or labels a 
person, place, thing, or idea [28]. Figure 4 shows more sentences with the word 
‘pesantren’ from the Leipzig corpora. 
Figure 4. The word ‘pesantren’ in Leipzig corpora 
In more detail, the word ‘boarding’ can be preceded by an adjective, article, or link clause. To be precise, the 
examples can be found in Figure 5 in line 1, line 5, and line 6, respectively. Another finding from the Islamic 
boarding school English corpus is that the word ‘boarding’ is always followed by “school”. Examples of 
sentences containing the word ‘boarding’ can be found in Figure 5. 


Voyant Tools, Contexts 
Document | Left | Term | Right 
file untu... | fostering morality of santri (Islamic | boarding | school students) in Pesantren (Islamic 
file untu... | school students) in Pesantren (Islamic | boarding | School) Miftahul Muhajirin Cidadap, Pagaden 
file untu... | and through education in Islamic | boarding | school. Pesantren is the oldest 
file untu... | supported by the fact that | boarding | schools are the traditional educational 
file untu... | within the scope of the | boarding | school community can learn together 
file untu... | only with the conviction that | boarding | schools are local agencies that 
file untu... | the history of the trip | boarding | schools. Communities and schools like 
file untu... | guided the akhlak of Islamic | boarding | school students of Miftahul Muhajirin 
file untu... | is simply understood as a | boarding | school which carries on the 
file untu... | educational system and system seed | boarding | school education, relevant done as 
file untu... | with cultural local is Islamic | boarding | school, yet in general, the 
file untu... | knowledge in the future, Islamic | boarding | school is more strengthening or 
file untu... | be a "reflection" that Islamic | boarding | school is urgent to revival 
file untu... | competitive (Mastuhu, 1999; 276), Islamic | boarding | school is faced by the 
file untu... | constructing the religious society. Islamic | boarding | school in the future is 
file untu... | demand of collaboration of Islamic | boarding | school with favorite school is 
file untu... | weakness (Muhaimin, 2009: 105). Islamic | boarding | school is assessed as the 

Figure 5. The word ‘boarding’ in Islamic boarding school English corpus 


J Edu & Learn, Vol. 18, No. 3, August 2024: 804-816 


J Edu & Learn ISSN: 2089-9823 


In contrast, the word ‘boarding’ in Leipzig corpora is mostly associated with airport activity. It is 
one of the operational processes by which passengers get into the aircraft after completing check-in 
procedures [31]. It is further proven by some possible word constructions such as boarding pass, boarding 
gate, boarding room, boarding system, boarding lounge, and boarding check-in. That being said, some 
sentences associated with Islamic boarding schools can still be found in examples 1 and 7 in the following 
Figure 6. From the examples, the word tends to be followed by a noun, and all sentences with the word 
‘boarding’ have a noun phrase structure. The examples of the word ‘boarding’ in the Leipzig 
corpora can be identified in Figure 6. Thus, it can be concluded that ‘boarding’ in the Islamic boarding school 
English corpus relates to boarding activity in Islamic boarding school. Meanwhile, although it still has a 
similar meaning, the word in the Leipzig corpora is predominantly used for boarding activity at the airport. 


Indonesian mixed corpus based on material from 2013 with 74,329,815 sentences. 
Word: boarding. Number of occurrences: 2,148. Rank: 30,775. Frequency class: 14. 
Words with similar context: check-in, keberangkatan, karcis, bagasi, pemberangkatan, gate 
Word graph (legible labels): pass, lounge, passenger, system, check-in 
Examples: 
* Yet, it is acknowledged that the BPK can now detect fake tickets and boarding passes. (repository.unhas.ac.id, collected on 30/01/2014) 
* I support it, no need for boarding school. (abangkakak.multiply.com, collected on 19/01/2011) 
* But for the gate, it will be printed on the boarding pass (www.wisatasingapura.web.id, collected on 03/02/2014) 
* Starting from immigration, boarding passes and so on. (mnurrikoputra.wordpress.com, collected on 08/05/2012) 
* Your passport and boarding pass will be requested and recorded. (ambarbriastuti.multiply.com, collected on 08/05/2012) 
* We waited in the boarding room (ekarosmi.wordpress.com, collected on 08/05/2012) 
* This Islamic boarding school is the forerunner of boarding schools in Indonesia. (10702486.siap-sekolah.com, collected on 08/05/2012) 
* After that, the boarding pass came out in my name. (mudjiarahardjo.uin-malang.ac.id, collected on 30/01/2014) 
* Here, you also have to prepare your passport for the officer to issue a boarding pass on your behalf. (ip.sg.or.id, collected on 08/05/2012) 
* For example, regarding the condition of the boarding room (www.jambi-independent.co.id, collected on 04/02/2014) 


Figure 6. The word ‘boarding’ in Leipzig Corpora 


3.3.2. Collocation of some unique words 

There were some unique word collocations produced by the Islamic boarding school English corpus. 
Three examples of collocation of unique words that belong to Islamic boarding school English corpus and 
Leipzig corpora will be elaborated. The words include resignation, Allah Swt., and recitation. 

a. Resignation 

The word ‘resignation’, also known as tawakal, is the attitude of surrendering everything to God after 
an effort has been made, and believing that whatever happens in the world will never happen without God's 
intervention [32]. In the sentences in Figure 7, the word ‘resignation’ mostly functions as a noun, as in line 
3: faith, Islam, charity, piety, sincerity, resignation, gratitude, patience, honesty, fairness, and responsibility. 
Meanwhile, in line 5, ‘resignation’ can function as a noun phrase. Regardless of its part of speech, 
this word is not mentioned much in this corpus, as can be seen in Figure 7. 

Unlike in the Islamic boarding school English corpus, the word ‘resignation’ is mentioned 55 times in 
the Leipzig corpora. This word always comes with the words syndrome, letters, and addresses. More 
interestingly, the word ‘resignation’ here has three meanings according to the context: i) it refers to people’s 
guilt in a crime; ii) it indicates an option for a minister; and iii) it denotes an official letter written by an 
employee to state the intention to resign from a position or job. For more details, the use of the 
word ‘resignation’ can be seen in Figure 8. 

b. Allah 

‘Allah’ is one word that often appears in the Islamic boarding school English corpus. In the doctrine of 
Islamic teaching, Allah is the creator of the universe, including humans [33], and is therefore known as God by 
Muslims. In terms of sentence structure, the word functions as a noun that can be preceded by a preposition or a connecting clause. If 
the word ‘Allah’ is at the beginning of a clause, it is followed by a verb. An instance can be 
seen in line 14, “On this night, Allah determines the future fate of...”. This sentence reveals that the word 
‘Allah’ is followed by the verb ‘determines’. The remaining examples can be identified in Figure 9. 

On the other hand, the word ‘Allah’ appears more often in the Leipzig corpora than in the 
Islamic boarding school English corpus because of the size of the data used, 32,196,275 sentences. The word ‘Allah’ 
appears in concordance lines with the words: Miyetti, Almighty, Insyaa Allah, Messenger of Allah, Ansar Allah, and in 
the name of Allah. To conclude, both the Islamic boarding school English corpus and the Leipzig corpora 
describe Allah as the Lord of humanity, the Almighty, and the creator of the universe. The words associated with 
Allah in the Leipzig corpora can be seen in Figure 10. 


Voyant Tools, Contexts 
Document | Left | Term | Right 
file untu... | ihsan; d) taqwa; f) Tawakal ( | resignation | ); g) syukur (Gratitude); h) sabar 
file untu... | faith, Islam, charity, piety, sincerity, | resignation | , gratitude, patience, honesty, fairness, responsibility 
file untu... | faith, islam, ihsan, taqwa, ikhlas, | resignation | , gratitude, patience, honesty, fairness, responsibility 
file untu... | Islam value described as a | resignation | and obedience to the rule 
file untu... | is the Creator; with their | resignation | as a slave who should 

Figure 7. The word ‘resignation’ in Islamic boarding school English corpus 
Word: Resignation. Number of occurrences: 55. Rank: 208,086. Frequency class: 19. 
See also: resignation 
Part of speech: Noun 
Baseform: resignation 
Part of: Resignation Syndrome, Resignation letter 
Examples: 
* Resignation meant to a lot of people he was guilty of his crimes and today most see Nixon as one of the worst Presidents in U.S. history. (depauliaonline.com, collected on 27/01/2020) 
* Resignation is the only option for a minister if allegations are true that the Crime Unit is investigating an allegation of rape against him. (antiguaobserver.com, collected on 01/11/2020) 
* Resignation demands come after a Facebook Live briefing on Friday where Krewson read the names and addresses of several residents who wrote letters to the mayor suggesting she defund the police department. (www.9news.com.au, collected on 29/06/2020) 
* Resignation demands come after a Facebook Live briefing, where Krewson read the 
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Figure 8. The word ‘resignation’ in Leipzig corpora 
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Figure 9. The word ‘Allah’ in Islamic boarding school English corpus 

English news corpus based on material from 2020 with 32,196,275 sentences. 

Word: Allah  Number of occurrences: 3,359  Rank: 14,261 
Frequency class: 13 
See also: Allāh, ALLAH, allah 
Part of speech: Proper noun 
Part of: Inshaa Allah, Subhan Allah, Alif Allah, Allah Valley, Allah akbar, ... 
Words with similar context: Almighty (0.39), the Almighty (0.33), God (0.28) 

Examples 
- Sacrifice deals with giving, with sharing those things that Allah places in your trust? 
  (muslimmatters.org, collected on 09/01/2020) 
- As per Quran Allah is forgiving in this mortal life if you seek His forgiveness. 


Figure 10. The word ‘Allah’ in Leipzig corpora 


c. Recitation 

Similar to the word ‘resignation’, the word ‘recitation’ appears only a limited number of times in the 
Islamic boarding school English corpus: 6 times. Its uses in sentences are shown in Figure 11. It can be 
inferred that the word ‘recitation’ is seldom used in the Islamic boarding school English corpus. It is striking 
that this word is rarely used in Islamic boarding school activities. Instead, most people prefer the phrase 
‘reading the Qur’an’ to ‘reciting the Qur’an’. This word choice is not appropriate, because reading is the 
practice of extracting and constructing meaning from written texts [34]. Meanwhile, the purpose of reciting 
is to acquire the necessary skills to read the Holy Quran through Arabic calligraphy by ensuring correct 
pronunciation, compliance with the rules of recitation science, the ability to stop at appropriate moments, 
as well as the ability to absorb sounds and tone [35]. 

Moreover, the word ‘recitation’ in the Leipzig corpora is a noun with a meaning similar to the one in 
the Islamic boarding school English corpus. Its use in the Leipzig corpora is presented in Figure 12, which 
shows that the word ‘recitation’ is usually related to recitation in religious activities. In Islam, it is often 
associated with Quran recitation, Quranic recitation, letter recitation, verses recitation, and poem 
recitation. Meanwhile, in other religions, it is often used in rosary recitation and Hanuman Chalisa 
recitation. Therefore, in religious activities, the word ‘recitation’ is more suitable when it refers to reciting 
the Qur’an, letters, verses, or other religious books. 

Document Left Term Right 
file untu... because in Islam the mere recitation of the Qur'an is a 
file untu... Gym believes that the mere recitation of the Qur'an itself is 
file untu... shalawat as well as the recitation of the holy Qur'an. He 
file untu... to pengajian or the melodious recitation of Holy Qur'an. Men's eyes 
file untu... tajwid (proper pronunciation for correct recitation of the Al-Qur'an), mantiq 
file untu... sad voice in the shalat recitations . Aa Gym’s sad voice is 


Figure 11. The word ‘recitation’ in Islamic boarding school English corpus 




English news corpus based on material from 2020 with 32,196,275 sentences. 

Word: recitation  Number of occurrences: 306  Rank: 70,697 
Frequency class: 17 
Part of: Quran recitation, Quranic recitation 

Examples 
- He even started a school in his hometown to teach students the finer points of poetry 
  recitation. (www.nytimes.com, collected on 02/12/2020) 
- It is a work of hybrid literature, combining lecture, recitation, meditation and public 
  response. (fortnightlyreview.co.uk, collected on 18/08/2020) 
- For a brief moment during his otherwise breathless recitation of perceived 
  successes, Trump turned to the climate issue. (qz.com, collected on 21/01/2020) 
- Muslim faithful gather in their large number to listen and learn from the recitation of 
  the glorious Quran. (blueprint.ng, collected on 24/04/2020) 
- There will be a recorded recitation, but that seems a bit sterile and disrespectful. 
  (pjmedia.com, collected on 11/09/2020) 


Figure 12. The word ‘recitation’ in the Leipzig corpora 


4. CONCLUSION 

The Islamic boarding school English corpus is built from the work of several authors and researchers 
in the field of Islamic boarding schools. It comprises 49,970 words, including 5,417 specific words, with a 
vocabulary density of 0.108 and a readability index of 12,980. The output of this corpus will be incorporated 
into instructional resources for developing Islamic boarding school students' general and/or specialized 
vocabulary. Students can learn these words and apply them in the learning process and in daily activities in 
the Islamic boarding school environment. Furthermore, the researchers will use the data from this corpus as 
a reference in developing an English language teaching model for Islamic boarding school students, and the 
data will then be included in the development of teaching materials. Future researchers who are interested 
in creating a corpus are therefore advised to define its purpose first. They can then use various tools to 
analyze language data and adapt them to the needs of their research. 
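
The reported vocabulary density can be checked with simple arithmetic. The Python sketch below assumes that vocabulary density is the number of unique word forms divided by the total number of words, and that the 5,417 specific words are the unique word forms; the function name is illustrative.

```python
def vocabulary_density(unique_words, total_words):
    """Ratio of unique word forms to total running words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return unique_words / total_words


# Figures reported for the Islamic boarding school English corpus.
density = vocabulary_density(unique_words=5_417, total_words=49_970)
print(f"Vocabulary density: {density:.3f}")  # Vocabulary density: 0.108
```

Under these assumptions the result agrees with the reported value of 0.108.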


APPENDIX 
Table 4. Fifty unique words in Islamic boarding school English corpus 
No  Words                                  Frequency in       Frequency in 
                                           Islamic boarding   Leipzig corpora 
                                           school English 
                                           corpus 
1   Islamic                                616                13,393 
2   Pesantren                              484                5,025 
3   Allah Swt.                             191                69,464 
4   Religious                              187                653 
5   Santri (students)                      144                3,958 
6   Monotheism (Tauhid)                    85                 6,005 
7   Prayer (shalat)                        70                 121,937 
8   Muslim                                 69                 126,379 
9   Mosque                                 57                 52 
10  Quran                                  48                 25,050 
11  Leader (kiai)                          130                25,739 
12  Congregation (makmum)                  14                 18 
13  Character (akhlaq)                     78                 2,537 
14  Morality (akhlaq)                      22                 63 
15  Prophet                                42                 58 
16  Faith (iman)                           27                 57 
17  Resignation (tawakal)                  5                  57 
18  Sincere (ikhlas)                       20                 56 
19  Courtesy (ihsan)                       11                 53 
20  Gratitude (rasa syukur)                8                  53 
21  Showing off (riya)                     7                  1,504 
22  Immorality (maksiat)                   3                  2 
23  Emergency (madlarat)                   2                  84 
24  Obligation (fardlu)                    18                 829 
25  Initiative (ikhtiar)                   5                  35 
26  Saint (wali Allah)                     8                  259 
27  Proselytizing (taushiyah)              20                 2 
28  Miraculous (Ma'unah)                   10                 23 
29  Miracle (mu'jizat)                     15                 498 
30  Haughty (takabbur)                     15                 378 
31  Modesty (tawadhu')                     10                 654 
32  Permissible (halal)                    10                 8,162 
33  Forbidden (haram)                      7                  297 
34  Recitation (pengajian)                 20                 14 
35  Recite (tadarrus)                      35                 227 
36  Gossip (ghibah)                        7                  316 
37  Steadfastness/consistency (istiqamah)  10                 412 
38  Piety (taqwa)                          20                 14 
39  Trustworthiness (amanah)               5                  33 
40  Verse (ayat Quran)                     40                 82 
41  Ablution (wudlu)                       5                  3 
42  Intention (niat)                       15                 133 
43  Repentance (taubat)                    10                 9 
44  Reward/merit (pahala)                  12                 1,227 
45  Introspection (muhasabah)              3                  2 
46  Benefit (maslahat)                     15                 931 
47  Teacher (ustadz/ustadzah)              41                 1,158 
48  Preach (khotbah)                       11                 8 
49  Charity (shodaqoh)                     5                  995 
50  Fasting (puasa)                        13                 45 